Sampled Weighted Min-Hashing for Large-Scale Topic Mining
We present Sampled Weighted Min-Hashing (SWMH), a randomized approach to
automatically mine topics from large-scale corpora. SWMH generates multiple
random partitions of the corpus vocabulary based on term co-occurrence and
agglomerates highly overlapping inter-partition cells to produce the mined
topics. While other approaches define a topic as a probabilistic distribution
over a vocabulary, SWMH topics are ordered subsets of such vocabulary.
Interestingly, the topics mined by SWMH underlie themes from the corpus at
different levels of granularity. We extensively evaluate the meaningfulness of
the mined topics both qualitatively and quantitatively on the NIPS (1.7 K
documents), 20 Newsgroups (20 K), Reuters (800 K) and Wikipedia (4 M) corpora.
Additionally, we compare the quality of SWMH with Online LDA topics for
document representation in classification.
Comment: 10 pages, Proceedings of the Mexican Conference on Pattern
Recognition 201
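The partition step can be illustrated with plain (unweighted) MinHash: terms whose document sets collide under a random-permutation signature land in the same vocabulary cell, and identical co-occurrence patterns collide with certainty. This is a simplified sketch of the idea, not the authors' weighted algorithm; the function name, parameters, and toy corpus are all illustrative.

```python
import random
from collections import defaultdict

def minhash_partition(term_docs, num_hashes=3, seed=0):
    """Group vocabulary terms whose document sets collide under a
    MinHash signature; colliding terms form one partition cell.
    Plain (unweighted) MinHash is used for clarity; SWMH proper
    uses a weighted variant over term co-occurrence."""
    rng = random.Random(seed)
    # One random permutation of document ids per hash function.
    all_docs = sorted({d for docs in term_docs.values() for d in docs})
    perms = []
    for _ in range(num_hashes):
        p = all_docs[:]
        rng.shuffle(p)
        perms.append({d: i for i, d in enumerate(p)})
    cells = defaultdict(list)
    for term, docs in term_docs.items():
        # Signature: smallest permuted index of any document containing the term.
        sig = tuple(min(perm[d] for d in docs) for perm in perms)
        cells[sig].append(term)
    return list(cells.values())

# Toy corpus: terms that co-occur in the same documents collide.
term_docs = {
    "neural":  {0, 1, 2},
    "network": {0, 1, 2},
    "bayes":   {3, 4},
    "prior":   {3, 4},
}
cells = minhash_partition(term_docs)
```

Terms with identical document sets always share a signature, while disjoint sets never can, so the toy cells recover the two co-occurrence groups.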
Powerpropagation: A sparsity inducing weight reparameterisation
The training of sparse neural networks is becoming an increasingly important tool
for reducing the computational footprint of models at training and evaluation, as
well as enabling the effective scaling up of models. Whereas much work over the
years has been dedicated to specialised pruning techniques, little attention has
been paid to the inherent effect of gradient based training on model sparsity. In
this work, we introduce Powerpropagation, a new weight-parameterisation for
neural networks that leads to inherently sparse models. Exploiting the behaviour
of gradient descent, our method gives rise to weight updates exhibiting a “rich get
richer” dynamic, leaving low-magnitude parameters largely unaffected by learning.
Models trained in this manner exhibit similar performance, but have a distribution
with markedly higher density at zero, allowing more parameters to be pruned safely.
Powerpropagation is general, intuitive, cheap and straightforward to implement
and can readily be combined with various other techniques. To highlight its versatility, we explore it in two very different settings: Firstly, following a recent
line of work, we investigate its effect on sparse training for resource-constrained
settings. Here, we combine Powerpropagation with a traditional weight-pruning
technique as well as recent state-of-the-art sparse-to-sparse algorithms, showing
superior performance on the ImageNet benchmark. Secondly, we advocate the use
of sparsity in overcoming catastrophic forgetting, where compressed representations allow accommodating a large number of tasks at fixed model capacity. In all
cases our reparameterisation considerably increases the efficacy of the off-the-shelf
methods.
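The reparameterisation and its "rich get richer" dynamic can be sketched in a few lines: writing each weight as w = v·|v|^(α−1) makes the chain-rule gradient with respect to v scale with |v|^(α−1), so low-magnitude parameters receive proportionally smaller updates. A minimal numpy sketch under that reading of the paper (α and the toy gradients are illustrative):

```python
import numpy as np

def powerprop_forward(v, alpha=2.0):
    """Reparameterise weights as w = v * |v|**(alpha - 1).
    For alpha > 1 the mapping preserves sign but stretches
    large magnitudes."""
    return v * np.abs(v) ** (alpha - 1.0)

def powerprop_grad_v(grad_w, v, alpha=2.0):
    """Chain rule: dL/dv = dL/dw * alpha * |v|**(alpha - 1).
    The per-weight update scales with the weight's own magnitude,
    producing the 'rich get richer' dynamic."""
    return grad_w * alpha * np.abs(v) ** (alpha - 1.0)

v = np.array([1.0, 0.1])
g_w = np.array([1.0, 1.0])      # identical loss gradient on both weights
g_v = powerprop_grad_v(g_w, v)  # the large-|v| entry gets the bigger update
```

With equal upstream gradients, the small-magnitude parameter barely moves, which is why the trained distribution concentrates near zero and prunes safely.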
Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Equivariant Projected Kernels
Gaussian processes are machine learning models capable of learning unknown functions in a way that represents uncertainty, thereby facilitating construction of optimal decision-making systems. Motivated by a desire to deploy Gaussian processes in novel areas of science, a rapidly-growing line of research has focused on constructively extending these models to handle non-Euclidean domains, including Riemannian manifolds, such as spheres and tori. We propose techniques that generalize this class to model vector fields on Riemannian manifolds, which are important in a number of application areas in the physical sciences. To do so, we present a general recipe for constructing gauge equivariant kernels, which induce Gaussian vector fields, i.e. vector-valued Gaussian processes coherent with geometry, from scalar-valued Riemannian kernels. We extend standard Gaussian process training methods, such as variational inference, to this setting. This enables vector-valued Gaussian processes on Riemannian manifolds to be trained using standard methods and makes them accessible to machine learning practitioners.
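One way to see the projection idea is on the two-sphere embedded in R^3: projecting a scalar ambient kernel onto tangent planes, K(x, y) = P_x k(x, y) P_y, yields a matrix-valued kernel whose induced Gaussian vector fields are tangent to the manifold everywhere. This is a toy of the extrinsic projection construction only, not the paper's full gauge-equivariant recipe; the squared-exponential kernel and lengthscale are illustrative choices.

```python
import numpy as np

def tangent_projector(x):
    """Orthogonal projector onto the tangent plane of the unit
    sphere at x (x is assumed unit-norm)."""
    return np.eye(3) - np.outer(x, x)

def projected_kernel(x, y, lengthscale=1.0):
    """Matrix-valued kernel K(x, y) = P_x k(x, y) P_y built from a
    scalar squared-exponential kernel k on the ambient space.
    Samples from the induced GP are vector fields tangent to the
    sphere at every point."""
    k = np.exp(-np.sum((x - y) ** 2) / (2 * lengthscale ** 2))
    return tangent_projector(x) @ (k * tangent_projector(y))

x = np.array([0.0, 0.0, 1.0])  # north pole
y = np.array([1.0, 0.0, 0.0])  # point on the equator
K = projected_kernel(x, y)
```

Tangency is immediate from the construction: x^T K = 0 and K y = 0 because each projector annihilates its own base point.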
Stacked Capsule Autoencoders
Objects are composed of a set of geometrically organized parts. We introduce
an unsupervised capsule autoencoder (SCAE), which explicitly uses geometric
relationships between parts to reason about objects. Since these relationships
do not depend on the viewpoint, our model is robust to viewpoint changes. SCAE
consists of two stages. In the first stage, the model predicts presences and
poses of part templates directly from the image and tries to reconstruct the
image by appropriately arranging the templates. In the second stage, SCAE
predicts parameters of a few object capsules, which are then used to
reconstruct part poses. Inference in this model is amortized and performed by
off-the-shelf neural encoders, unlike in previous capsule networks. We find
that object capsule presences are highly informative of the object class, which
leads to state-of-the-art results for unsupervised classification on SVHN (55%)
and MNIST (98.7%). The code is available at
https://github.com/google-research/google-research/tree/master/stacked_capsule_autoencoders
Comment: NeurIPS 2019; 14 pages, 7 figures, 4 tables
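The viewpoint-invariance argument can be made concrete with a small geometric toy: if each part's pose in the object's frame is a fixed relation, then part poses in the image are the object pose composed with those relations, and a viewpoint change acts on all parts through the object pose alone, leaving the relations untouched. This is only an illustration of the geometry, not the SCAE network; the 2-D affine poses and relations below are invented for the example.

```python
import numpy as np

def affine2d(theta, tx, ty):
    """Homogeneous 2-D pose: rotation by theta plus translation (tx, ty)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty],
                     [0,  0, 1.0]])

# Viewpoint-invariant object-part relations (fixed poses in the
# object's own frame; values are arbitrary for illustration).
relations = [affine2d(0.0, 1.0, 0.0), affine2d(np.pi / 2, 0.0, 1.0)]

def predict_part_poses(object_pose, relations):
    """Part pose = object pose composed with the fixed relation,
    mirroring how an object capsule reconstructs its parts' poses."""
    return [object_pose @ rel for rel in relations]

obj = affine2d(0.3, 2.0, -1.0)
parts = predict_part_poses(obj, relations)
```

Applying a viewpoint change V to the object pose moves every predicted part pose by exactly V, which is the robustness to viewpoint changes the abstract describes.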
Effectiveness and resource requirements of test, trace and isolate strategies for COVID in the UK
We use an individual-level transmission and contact simulation
model to explore the effectiveness and resource requirements of
various test-trace-isolate (TTI) strategies for reducing the spread
of SARS-CoV-2 in the UK, in the context of different scenarios
with varying levels of stringency of non-pharmaceutical
interventions. Based on modelling results, we show that self-isolation
of symptomatic individuals and quarantine of their
household contacts have a substantial impact on the number of
new infections generated by each primary case. We further
show that adding contact tracing of non-household contacts of
confirmed cases to this broader package of interventions
reduces the number of new infections otherwise generated by
5–15%. We also explore the impact of key factors, such as tracing
application adoption and testing delay, on the overall effectiveness
of TTI.
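The arithmetic behind such estimates can be sketched with a deterministic expectation model: self-isolation averts a fraction of a case's onward transmissions, household quarantine suppresses the household share, and tracing removes a further fraction of the remaining non-household transmissions. All parameter values below are illustrative assumptions, not the paper's estimates, and the toy deliberately ignores timing, compliance, and contact-network structure.

```python
def expected_new_infections(r0=2.6, isolation_effect=0.65,
                            household_frac=0.3, quarantine_effect=0.9,
                            trace_prob=0.5, trace_effect=0.5):
    """Expected onward infections per primary case under a toy TTI
    model (all parameters are illustrative assumptions).
    Returns the expectation without and with non-household tracing."""
    base = r0 * (1.0 - isolation_effect)          # case self-isolates
    household = base * household_frac * (1.0 - quarantine_effect)
    other = base * (1.0 - household_frac)         # non-household contacts
    no_tracing = household + other
    with_tracing = household + other * (1.0 - trace_prob * trace_effect)
    return no_tracing, with_tracing

no_t, with_t = expected_new_infections()
```

Varying `trace_prob` (app adoption) or adding a delay discount to `trace_effect` reproduces the qualitative sensitivity analysis the abstract mentions.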
The Mondrian Kernel
We introduce the Mondrian kernel, a fast approximation to the Laplace kernel. It is suitable for both batch and online learning, and admits a fast kernel-width-selection procedure as the random features can be re-used efficiently for all kernel widths. The features are constructed by sampling trees via a Mondrian process [Roy and Teh, 2009], and we highlight the connection to Mondrian forests [Lakshminarayanan et al., 2014], where trees are also sampled via a Mondrian process, but fit independently. This link provides a new insight into the relationship between kernel methods and random forests.
Funding: Gatsby Charitable Foundation, Alan Turing Institute, Google, Microsoft Research, Engineering and Physical Sciences Research Council (Grant ID: EP/N014162/1), NSERC (Discovery Grant), European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) (Grant agreement no. 617071).
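The Laplace-kernel connection is easiest to verify in one dimension, where a Mondrian process with a given lifetime cuts the interval at the points of a Poisson process, so two inputs share a partition cell with probability exp(−lifetime·|x − y|). The Monte Carlo sketch below is our own minimal construction of that 1-D special case, not the authors' code:

```python
import numpy as np

def mondrian_kernel_estimate(x, y, lifetime=2.0, n_trees=5000, seed=0):
    """In 1-D, a Mondrian process with the given lifetime cuts [0, 1]
    at the points of a rate-`lifetime` Poisson process, so two inputs
    land in the same cell with probability exp(-lifetime * |x - y|):
    the Laplace kernel.  Averaging same-cell indicators over many
    independent trees approximates that kernel."""
    rng = np.random.default_rng(seed)
    same_cell = 0
    for _ in range(n_trees):
        cuts = rng.uniform(0.0, 1.0, size=rng.poisson(lifetime))
        # Same cell iff the same number of cuts falls below each input.
        if np.sum(cuts < x) == np.sum(cuts < y):
            same_cell += 1
    return same_cell / n_trees

est = mondrian_kernel_estimate(0.2, 0.5)  # target: exp(-2 * 0.3)
```

Because the same sampled trees serve every pair of inputs, the features can indeed be shared across evaluations, which is the reuse property the abstract highlights.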
How Robust are the Estimated Effects of Nonpharmaceutical Interventions against COVID-19?
To what extent are effectiveness estimates of nonpharmaceutical interventions (NPIs) against COVID-19 influenced by the assumptions our models make? To answer this question, we investigate 2 state-of-the-art NPI effectiveness models and propose 6 variants that make different structural assumptions. In particular, we investigate how well NPI effectiveness estimates generalise to unseen countries, and their sensitivity to unobserved factors. Models which account for noise in disease transmission compare favourably. We further evaluate how robust estimates are to different choices of epidemiological parameters and data. Focusing on models that assume transmission noise, we find that previously published results are robust across these choices and across different models. Finally, we mathematically ground the interpretation of NPI effectiveness estimates when certain common assumptions do not hold.
Location Dependent Dirichlet Processes
Dirichlet processes (DP) are widely applied in Bayesian nonparametric
modeling. However, in their basic form they do not directly integrate
dependency information among data arising from space and time. In this paper,
we propose location dependent Dirichlet processes (LDDP) which incorporate
nonparametric Gaussian processes in the DP modeling framework to model such
dependencies. We develop the LDDP in the context of mixture modeling, and
develop a mean field variational inference algorithm for this mixture model.
The effectiveness of the proposed modeling framework is shown on an image
segmentation task.
Globally Continuous and Non-Markovian Crowd Activity Analysis from Videos
Automatically recognizing activities in video is a classic problem in vision and helps to understand behaviors, describe scenes and detect anomalies. We propose an unsupervised method for such purposes. Given video data, we discover recurring activity patterns that appear, peak, wane and disappear over time. By using non-parametric Bayesian methods, we learn coupled spatial and temporal patterns with minimum prior knowledge. To model the temporal changes of patterns, previous works compute Markovian progressions or locally continuous motifs, whereas we model time in a globally continuous and non-Markovian way. Visually, the patterns depict flows of major activities. Temporally, each pattern has its own unique appearance-disappearance cycles. To compute compact pattern representations, we also propose a hybrid sampling method. By combining these patterns with detailed environment information, we interpret the semantics of activities and report anomalies. Also, our method fits data better and detects anomalies that were difficult to detect previously.